{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 在 GPU 上使用量桨\n",
    "\n",
    "<em> Copyright (c) 2021 Institute for Quantum Computing, Baidu Inc. All Rights Reserved. </em>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 简介\n",
    "\n",
    "> 注意，本篇教程具有时效性。同时不同电脑也会有个体差异性，本篇教程不保证所有电脑可以安装成功。\n",
    "\n",
    "在深度学习中，大家通常会使用 GPU 来进行神经网络模型的训练，因为与 CPU 相比，GPU在浮点数运算方面有着显著的优势。因此，使用 GPU 来训练神经网络模型逐渐成为共同的选择。在 Paddle Quantum 中，我们的量子态和量子门也采用基于浮点数的复数表示，因此我们的模型如果能部署到 GPU 上进行训练，也会显著提升训练速度。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## GPU 选择\n",
    "\n",
    "在这里，我们选择 Nvidia 的硬件设备，其 CUDA（Compute Unified Device Architecture）对深度学习的框架支持比较好，PaddlePaddle 也可以比较方便地安装在 CUDA 上。\n",
    "\n",
    "## 配置 CUDA 环境\n",
    "\n",
    "### 安装 CUDA\n",
    "\n",
    "这里，我们介绍如何在 x64 平台上的 Windows10 系统中配置 CUDA 环境。首先，在[CUDA GPUs | NVIDIA Developer](https://developer.nvidia.com/cuda-gpus)上查看你的GPU是否可以安装CUDA环境。然后，在[NVIDIA 驱动程序下载](https://www.nvidia.cn/Download/index.aspx?lang=cn)下载你的显卡的最新版驱动，并安装到电脑上。\n",
    "\n",
    "在[飞桨的安装步骤](https://www.paddlepaddle.org.cn/install/quick)中，我们发现，**PaddlePaddle 的 GPU版本支持CUDA 9.0/10.0/10.1/10.2/11.0，且仅支持单卡**。所以这里，我们安装 CUDA10.2。在[CUDA Toolkit Archive | NVIDIA Developer](https://developer.nvidia.com/cuda-toolkit-archive)找到 CUDA 10.2 的下载地址：[CUDA Toolkit 10.2 Archive | NVIDIA Developer](https://developer.nvidia.com/cuda-10.2-download-archive)，下载CUDA后，运行安装。\n",
    "\n",
    "在安装过程中，选择**自定义安装**，在 CUDA 选项中，勾选除 Visual Studio Intergration 外的其他内容（除非你理解 Visual Studio Intergration 的作用），然后除 CUDA 之外，其他选项均不勾选。然后安装位置选择默认位置（请留意你的 CUDA 的安装位置，后面需要设置环境变量），等待安装完成。\n",
    "\n",
    "安装完成之后，打开 Windows 命令行，输入`nvcc -V`，如果看到版本信息，则说明 CUDA 安装成功。\n",
    "\n",
    "### 安装 cuDNN\n",
    "\n",
    "在[NVIDIA cuDNN | NVIDIA Developer](https://developer.nvidia.com/cudnn)下载 cuDNN，根据[飞桨的安装步骤](https://www.paddlepaddle.org.cn/install/quick)中的要求，我们**需要使用 cuDNN 7.6.5+** ，因此我们下载支持 CUDA 10.2 的 7.6.5 版 cuDNN 即可。下载完成 cuDNN 后进行解压缩。然后，假设我们的 CUDA 的安装路径为`C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2`，我们将 cuDNN 解压缩后里面的`bin`、`include`和`lib`中的文件都替换 CUDA 的安装路径下的对应文件（如果文件已存在则进行替换，如果未存在则直接粘贴到对应目录中）。到这里，cuDNN 也就安装完成了。\n",
    "\n",
    "\n",
    "### 配置环境变量\n",
    "\n",
    "接下来还需要配置环境变量。右键电脑桌面上的“此电脑”（或“文件资源管理器”左栏的“此电脑”），选择“属性”，然后选择左侧的“高级系统设置”，在“高级”这一栏下选择“环境变量”。\n",
    "\n",
    "现在就进入到了环境变量的设置页面，在系统变量中选择`Path`，点击“编辑”。在出现的页面中，查看是否有`C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2\\bin`和`C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2\\libnvvp`这两个地址（其前缀`C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2`应该为你的 CUDA 的安装位置），如果没有，请手动添加。\n",
    "\n",
    "### 验证是否安装成功\n",
    "\n",
    "打开命令行，输入`cd C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.2\\extras\\demo_suite`进入到 CUDA 安装路径（这里也应该为你的 CUDA 的安装位置）。然后分别执行`.\\bandwidthTest.exe`和`.\\deviceQuery.exe`，如果都出现`Result = PASS`，则说明安装成功。\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 在 CUDA 环境上安装 PaddlePaddle\n",
    "\n",
    "根据[飞桨的安装步骤](https://www.paddlepaddle.org.cn/install/quick)中的说明，我们首先需要确定自己的python环境，用`python --version`来查看 python 版本，保证**python版本是3.5.1+/3.6+/3.7/3.8**，并且用`python -m ensurepip`和`python -m pip --version`来查看 pip 版本，**确认是 20.2.2+**。然后，使用`python -m pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple`来安装 GPU 版本的 PaddlePaddle。\n",
    "\n",
    "## 安装 Paddle Quantum\n",
    "\n",
    "下载 Paddle Quantum 的安装包，修改`setup.py`和`requirements.txt`，将其中的`paddlepaddle`改为`paddlepaddle-gpu`，然后按照 Paddle Quantum 的从源代码安装的要求，执行`pip install -e .`即可。\n",
    "\n",
    "> 如果你是在一个新的 python 环境中安装了 paddlepaddle-gpu 和 paddle_quantum，请在新的 python 环境中安装 jupyter，并在新的 jupyter 下重新打开本教程并运行。\n",
    "\n",
    "## 检测是否安装成功\n",
    "\n",
    "打开我们 GPU 版本的 PaddlePaddle 环境，执行下面的命令，若输出为`True`则表示当前 PaddlePaddle 框架可以在GPU上运行。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "True\n"
     ]
    }
   ],
   "source": [
    "import paddle\n",
    "print(paddle.is_compiled_with_cuda())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 使用教程和示例\n",
    "\n",
    "在 Paddle Quantum 中，我们使用动态图机制来定义和训练我们的参数化量子线路。在这里，我们依然使用动态图机制，只需要定义动态图机制的运行设备即可。方式如下："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "# 0 表示使用编号为0的GPU\n",
    "paddle.set_device(\"gpu:0\")\n",
    "# build and train your quantum circuit model\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "当我们想在 CPU 上运行时，也采用类似的方式，定义运行设备为CPU："
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "paddle.set_device(\"cpu\")\n",
    "# build and train your quantum circuit model\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我们可以在命令行中输入`nvidia-smi`来查看 GPU 的使用情况，包括有哪些程序在哪些 GPU 上运行，以及其显存占用情况。\n",
    "\n",
    "这里，我们以 [VQE](../tutorial/quantum_simulation/VQE_CN.ipynb) 为例来说明我们该如何使用 GPU。首先，导入相关的包并定义相关的变量和函数。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy\n",
    "from numpy import concatenate\n",
    "from numpy import pi as PI\n",
    "from numpy import savez, zeros\n",
    "from paddle import matmul, transpose\n",
    "from paddle_quantum.circuit import UAnsatz\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy\n",
    "\n",
    "\n",
    "def H2_generator():\n",
    "\n",
    "    H = [\n",
    "        [-0.04207897647782277, 'i0'],\n",
    "        [0.17771287465139946, 'z0'],\n",
    "        [0.1777128746513994, 'z1'],\n",
    "        [-0.2427428051314046, 'z2'],\n",
    "        [-0.24274280513140462, 'z3'],\n",
    "        [0.17059738328801055, 'z0,z1'],\n",
    "        [0.04475014401535163, 'y0,x1,x2,y3'],\n",
    "        [-0.04475014401535163, 'y0,y1,x2,x3'],\n",
    "        [-0.04475014401535163, 'x0,x1,y2,y3'],\n",
    "        [0.04475014401535163, 'x0,y1,y2,x3'],\n",
    "        [0.12293305056183797, 'z0,z2'],\n",
    "        [0.1676831945771896, 'z0,z3'],\n",
    "        [0.1676831945771896, 'z1,z2'],\n",
    "        [0.12293305056183797, 'z1,z3'],\n",
    "        [0.1762764080431959, 'z2,z3']\n",
    "        ]\n",
    "    N = 4\n",
    "    \n",
    "    return H, N\n",
    "\n",
    "\n",
    "Hamiltonian, N = H2_generator()\n",
    "\n",
    "\n",
    "def U_theta(theta, Hamiltonian, N, D):\n",
    "    \"\"\"\n",
    "    Quantum Neural Network\n",
    "    \"\"\"\n",
    "\n",
    "    # 按照量子比特数量/网络宽度初始化量子神经网络\n",
    "    cir = UAnsatz(N)\n",
    "\n",
    "    # 内置的 {R_y + CNOT} 电路模板\n",
    "    cir.real_entangled_layer(theta[:D], D)\n",
    "\n",
    "    # 铺上最后一列 R_y 旋转门\n",
    "    for i in range(N):\n",
    "        cir.ry(theta=theta[D][i][0], which_qubit=i)\n",
    "\n",
    "    # 量子神经网络作用在默认的初始态 |0000>上\n",
    "    cir.run_state_vector()\n",
    "\n",
    "    # 计算给定 Hamilton 量的期望值\n",
    "    expectation_val = cir.expecval(Hamiltonian)\n",
    "\n",
    "    return expectation_val\n",
    "\n",
    "\n",
    "class StateNet(paddle.nn.Layer):\n",
    "    \"\"\"\n",
    "    Construct the model net\n",
    "    \"\"\"\n",
    "\n",
    "    def __init__(self, shape, dtype=\"float64\"):\n",
    "        super(StateNet, self).__init__()\n",
    "\n",
    "        # 初始化 theta 参数列表，并用 [0, 2*pi] 的均匀分布来填充初始值\n",
    "        self.theta = self.create_parameter(\n",
    "            shape=shape,\n",
    "            default_initializer=paddle.nn.initializer.Uniform(low=0.0, high=2 * PI), \n",
    "            dtype=dtype, \n",
    "            is_bias=False)\n",
    "\n",
    "    # 定义损失函数和前向传播机制\n",
    "    def forward(self, Hamiltonian, N, D):\n",
    "        # 计算损失函数/期望值\n",
    "        loss = U_theta(self.theta, Hamiltonian, N, D)\n",
    "\n",
    "        return loss\n",
    "\n",
    "ITR = 80 # 设置训练的总迭代次数\n",
    "LR = 0.2 # 设置学习速率\n",
    "D = 2    # 设置量⼦神经⽹络中重复计算模块的深度 Depth"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果要使用GPU训练，则运行下面的程序："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "iter: 20 loss: -0.9686\n",
      "iter: 20 Ground state energy: -0.9686 Ha\n",
      "iter: 40 loss: -1.1056\n",
      "iter: 40 Ground state energy: -1.1056 Ha\n",
      "iter: 60 loss: -1.1235\n",
      "iter: 60 Ground state energy: -1.1235 Ha\n",
      "iter: 80 loss: -1.1317\n",
      "iter: 80 Ground state energy: -1.1317 Ha\n"
     ]
    }
   ],
   "source": [
    "# 0 表示使用编号为0的GPU\n",
    "paddle.set_device(\"gpu:0\")\n",
    "  \n",
    "# 确定网络的参数维度\n",
    "net = StateNet(shape=[D + 1, N, 1])\n",
    "\n",
    "# 一般来说，我们利用Adam优化器来获得相对好的收敛\n",
    "# 当然你可以改成SGD或者是RMSProp\n",
    "opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())\n",
    "\n",
    "# 记录优化结果\n",
    "summary_iter, summary_loss = [], []\n",
    "\n",
    "# 优化循环\n",
    "for itr in range(1, ITR + 1):\n",
    "\n",
    "    # 前向传播计算损失函数\n",
    "    loss = net(Hamiltonian, N, D)\n",
    "\n",
    "    # 在动态图机制下，反向传播极小化损失函数\n",
    "    loss.backward()\n",
    "    opt.minimize(loss)\n",
    "    opt.clear_grad()\n",
    "\n",
    "    # 更新优化结果\n",
    "    summary_loss.append(loss.numpy())\n",
    "    summary_iter.append(itr)\n",
    "\n",
    "    # 打印结果\n",
    "    if itr % 20 == 0:\n",
    "        print(\"iter:\", itr, \"loss:\", \"%.4f\" % loss.numpy())\n",
    "        print(\"iter:\", itr, \"Ground state energy:\", \n",
    "              \"%.4f Ha\" % loss.numpy())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "如果要使用CPU训练，则运行下面的程序："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "iter: 20 loss: -1.1080\n",
      "iter: 20 Ground state energy: -1.1080 Ha\n",
      "iter: 40 loss: -1.1301\n",
      "iter: 40 Ground state energy: -1.1301 Ha\n",
      "iter: 60 loss: -1.1357\n",
      "iter: 60 Ground state energy: -1.1357 Ha\n",
      "iter: 80 loss: -1.1361\n",
      "iter: 80 Ground state energy: -1.1361 Ha\n"
     ]
    }
   ],
   "source": [
    "# 表示使用 CPU\n",
    "paddle.set_device(\"cpu\")\n",
    "  \n",
    "# 确定网络的参数维度\n",
    "net = StateNet(shape=[D + 1, N, 1])\n",
    "\n",
    "# 一般来说，我们利用Adam优化器来获得相对好的收敛\n",
    "# 当然你可以改成SGD或者是RMSProp\n",
    "opt = paddle.optimizer.Adam(learning_rate=LR, parameters=net.parameters())\n",
    "\n",
    "# 记录优化结果\n",
    "summary_iter, summary_loss = [], []\n",
    "\n",
    "# 优化循环\n",
    "for itr in range(1, ITR + 1):\n",
    "\n",
    "    # 前向传播计算损失函数\n",
    "    loss = net(Hamiltonian, N, D)\n",
    "\n",
    "    # 在动态图机制下，反向传播极小化损失函数\n",
    "    loss.backward()\n",
    "    opt.minimize(loss)\n",
    "    opt.clear_grad()\n",
    "\n",
    "    # 更新优化结果\n",
    "    summary_loss.append(loss.numpy())\n",
    "    summary_iter.append(itr)\n",
    "\n",
    "    # 打印结果\n",
    "    if itr % 20 == 0:\n",
    "        print(\"iter:\", itr, \"loss:\", \"%.4f\" % loss.numpy())\n",
    "        print(\"iter:\", itr, \"Ground state energy:\", \n",
    "              \"%.4f Ha\" % loss.numpy())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 总结\n",
    "\n",
    "按照我们的测试，现在版本的 paddle_quantum 可以在 GPU 下运行，但是需要比较好的 GPU 资源才能体现出足够的加速效果。在未来的版本中，我们也会不断优化 paddle_quantum 在 GPU 下的性能表现，敬请期待。\n",
    "\n",
    "_______\n",
    "\n",
    "## 参考资料\n",
    "\n",
    "[1] [Installation Guide Windows :: CUDA Toolkit Documentation](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html)\n",
    "\n",
    "[2] [Installation Guide :: NVIDIA Deep Learning cuDNN Documentation](https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#installwindows)\n",
    "\n",
    "[3] [开始使用_飞桨-源于产业实践的开源深度学习平台](https://www.paddlepaddle.org.cn/install/quick)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
